{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d8ce023d",
   "metadata": {},
   "source": [
    "# 数组的读写\n",
    "\n",
    "数组支持方便的读写操作。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71999002",
   "metadata": {},
   "source": [
    "## 数组的读取"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d40c256e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b86848b2",
   "metadata": {},
   "source": [
    "可以用函数np.loadtxt()从文本文件中读取数据。假设有这样的一个文件myfile.txt，内容为："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "95ad9d9f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Writing myfile.txt\n"
     ]
    }
   ],
   "source": [
    "%%writefile myfile.txt\n",
    "2.1 2.3 3.2 1.3 3.1\n",
    "6.1 3.1 4.2 2.3 1.8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "055281d2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2.1, 2.3, 3.2, 1.3, 3.1],\n",
       "       [6.1, 3.1, 4.2, 2.3, 1.8]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.loadtxt('myfile.txt')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4a5ad5c5",
   "metadata": {},
   "source": [
    "默认文件中的数据是空格分割，如果分割符不是空格，则可以通过参数指定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "75902cde",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting myfile.txt\n"
     ]
    }
   ],
   "source": [
    "%%writefile myfile.txt\n",
    "2.1, 2.3, 3.2, 1.3, 3.1\n",
    "6.1, 3.1, 4.2, 2.3, 1.8"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "04cc11f7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2.1, 2.3, 3.2, 1.3, 3.1],\n",
       "       [6.1, 3.1, 4.2, 2.3, 1.8]])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.loadtxt('myfile.txt', delimiter=',')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37f9c004",
   "metadata": {},
   "source": [
    "## 数组的二进制读写"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6c1d3d51",
   "metadata": {},
   "source": [
    "直接读写文本方式不如使用二进制读写速度快效率高。可以将数组存储成二进制格式，并进行读取。\n",
    "\n",
    "用于保存数组的函数有：\n",
    "- `np.save(file_name, arr)`：保存单个数组，.npy格式。\n",
    "- `np.savez(file_name, *args, **kwds)`：保存多个数组，无压缩的.npz格式。\n",
    "用于读取数组的函数为：\n",
    "- `np.load(file_name)`：对于.npy文件，返回保存的数组；对于.npz文件，返回一个名称-数组对组成的字典。\n",
    "\n",
    "考虑以下两个数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "323405a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = np.ones((2, 3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "aca9e4a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "b = np.zeros(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc1fb430",
   "metadata": {},
   "source": [
    "单个数组可以用`np.save()`保存，并用`np.load()`函数读取，读取后，返回一个数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "733f2415",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.save(\"data\", a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "7d06a7dd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1., 1., 1.],\n",
       "       [1., 1., 1.]])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.load(\"data.npy\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66e37de0",
   "metadata": {},
   "source": [
    "多个数组可以用`np.savez()`和`np.load()`函数读写。`np.savez()`函数支持两种模式保存多个数组，第一种是`*arg`模式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "ea122ac8",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.savez(\"dataall\", a, b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "d6a4f573",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = np.load(\"dataall.npz\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "28ecd0ed",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<numpy.lib.npyio.NpzFile at 0x10e582790>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2fed65a",
   "metadata": {},
   "source": [
    "当`np.load()`函数读取保存的多个数组时，函数会返回一个字典。在`*arg`模式下，这些数组对应的键会被自动命名为`arr_数字`的形式，其中的数字按照保存的顺序，从0开始，因此，用键`arr_0`索引得到的数组是a，`arr_1`索引得到的数组是b："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "072f4507",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "arr_0 [[1. 1. 1.]\n",
      " [1. 1. 1.]]\n",
      "arr_1 [0. 0. 0.]\n"
     ]
    }
   ],
   "source": [
    "for k in data:\n",
    "    print(k, data[k])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b37c863",
   "metadata": {},
   "source": [
    "第二种是`**kwds`模式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "73975f1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.savez(\"dataall\", x=a, y=b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "13976842",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = np.load(\"dataall.npz\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d3f9a72",
   "metadata": {},
   "source": [
    "此时索引的字典是保存时传入的参数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "83b93354",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x [[1. 1. 1.]\n",
      " [1. 1. 1.]]\n",
      "y [0. 0. 0.]\n"
     ]
    }
   ],
   "source": [
    "for k in data:\n",
    "    print(k, data[k])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "28b5879d",
   "metadata": {},
   "source": [
    "删除生成的文件："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "21793b08",
   "metadata": {},
   "outputs": [],
   "source": [
    "%rm myfile.txt data.npy dataall.npz"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
